Ijraset Journal For Research in Applied Science and Engineering Technology
Authors: V Saravana Kumar, K Kranthi Kumar, G Sathish, R Sumana, A Srinidhi
DOI Link: https://doi.org/10.22214/ijraset.2023.52832
In today's world, feelings play an important part in many aspects of life. The spectrum of human feelings, whether or not they are verbalised, serves as the foundation for emotions. A diverse range of emotional expressions can be used to convey a person's sense of individuality, and a person's state of mind can often be inferred simply by observing their facial expressions. This project aims to determine people's feelings from images of them and then match those feelings with suitable musical selections. Most of today's popular music discovery tools rely on a social or content-based recommendation engine of some kind. However, a listener's taste in music depends on a variety of factors beyond their previous listening habits and the subject matter of the songs they hear; the emotions of users are also important.
I. INTRODUCTION
Wearable computing is the research and development of computing and sensing systems that can be worn on the body and which exploit a novel kind of human-computer interface by way of a permanently attached physiological component. These systems can be worn on a variety of different parts of the body, including the head, the chest, the arms, and the legs. Both the number of people who use wearable computers and the range of applications for which they are being put to use are expanding at a rapid rate.
These developments affect industries such as medicine, physical fitness, geriatrics, special education, public transit, economics, video games, and music. Recommendation engines use algorithms to discover which parts of a huge database contain the information most pertinent to a particular user. By examining a user's prior clicks and browsing history to detect patterns, recommendation engines can personalise their suggestions to each user. In most cases, however, recommender systems do not take into account the facial expressions and emotions of human users, even though emotions have a significant impact on people's daily lives.
Computers need to take into account the feelings of their human conversation partners across a wide range of applications, including human-robot interaction, computer-assisted instruction, emotionally intelligent interactive games, neuromarketing, and socially intelligent software development. Emotion detection technologies have proven beneficial to both speech analytics and the analysis of facial expressions. However, given the nature of human behaviour, it can be difficult to detect emotions reliably from speech or facial expressions alone. A person's physiological signals, as opposed to their facial expressions, provide a more dependable window into their mental and emotional condition. We believe that combining emotion detection algorithms with wearable computing devices can improve the accuracy of the suggestions generated by a music recommender system, and this is what prompted our work. In our prior research, we investigated identifying emotions based solely on GSR signals. Within the scope of this study, we additionally use PPG to improve signal quality, and we describe a strategy for identifying emotions by fusing the available data sources. The proposed method for recommending music on a wearable device takes into account the user's demographic information as well as their current state of mind. Using data from GSR and PPG, we have obtained promising results in emotion prediction.
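As an illustration of the data fusion idea, a minimal feature-level fusion sketch is shown below. It assumes that per-sample feature vectors have already been extracted from the GSR and PPG signals; the placeholder arrays and the SVM classifier are assumptions for illustration, not the exact configuration used in this work.

import numpy as np
from sklearn.svm import SVC

# Placeholder feature matrices: one row per recording, one column per extracted feature.
gsr_features = np.random.rand(60, 12)   # e.g. statistical features from the GSR signal
ppg_features = np.random.rand(60, 8)    # e.g. heart-rate-variability features from PPG
labels = np.random.randint(0, 7, size=60)  # emotion class per recording

# Feature-level fusion: concatenate the two feature sets for each sample.
X = np.hstack([gsr_features, ppg_features])

# Train a classifier on the fused representation.
clf = SVC(kernel="rbf")
clf.fit(X, labels)
print(clf.predict(X[:5]))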
II. RELATED WORKS
According to Ekman and colleagues, there are seven primary classifications of facial expressions: anger, disgust, joy, fear, surprise, sadness, and neutral [7]. People of diverse ages, social backgrounds, and racial origins were able to comprehend these expressions and respond to them appropriately. However, relying on facial expression data alone may not be sufficient when attempting to identify human emotions.
A person's physiological signals, as opposed to their facial expressions, provide a more dependable window into their mental and emotional condition. Physiological markers such as respiration, cardiac output, and galvanic skin resistance/conductivity have therefore been used to remedy this deficiency and carry out emotion recognition and tracking tasks successfully.
The standard recommendation algorithms use either content-based or collaborative filtering methods, neither of which takes into account the emotional state of the user [11], [12]. Taking users' emotional states into account may therefore improve the performance of recommendation engines. Shin et al. developed a computerised system that suggests relaxing music [8]; its key component was a portable, cordless PPG finger sensor. Nirjon et al. proposed a context-aware, biosensor-based music recommender system for mobile devices [9]. The music suggestion system developed by Liu and colleagues takes into account the user's preferences as well as their heart rate [10].
We present a music recommendation system that, while making suggestions, takes into account the user's current state of mind. The choices made by the system are largely informed not just by the tastes of the user but also by the likely emotional consequences of the music that is being suggested. The emotional state of the user is taken into account both before and after the algorithm makes a song recommendation for the user. The GSR and PPG of the user are continuously monitored by the framework, which looks for any changes in the user's emotional state.
In a previous article [6], we presented a method of emotion detection that relied solely on GSR signals. In this study, we augment those signals with PPG and present a data fusion-based approach to emotion identification for music recommendation engines. Our approach is an effort to improve music recommendation algorithms by taking the emotional state of listeners into account.
III. PROPOSED METHODOLOGY
Personal Data Storage (PDS) has set off a major paradigm shift in how people access and manage their own data by moving from a service-centric to a user-centric perspective. Users of personal data stores (PDSs) can gather all of their information in a single, safe location. Once connected, the data can be accessed by the appropriate analytic tools, shared with the appropriate individuals, and controlled by its owner.
For the purposes of testing and training the system, we have collected 28 test instances consisting of images depicting a range of emotions. In practice, a dataset serves several purposes besides training. It is common practice, when evaluating a trained model, to divide a single, previously processed training dataset into several smaller subsets; for this reason, testing datasets are typically kept separate from production data. A validation dataset is strongly recommended, although not strictly necessary, to avoid evaluating the algorithm on the same data it was trained on, which can lead to biased predictions.
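A minimal sketch of such a split is shown below. The placeholder arrays stand in for the 28 labelled face images and their emotion labels, and the split ratios are illustrative assumptions rather than the exact proportions used in this work.

import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 28 pre-processed 48x48 grayscale face images and their emotion labels.
X = np.random.rand(28, 48, 48, 1)
y = np.random.randint(0, 7, size=28)

# Hold out a test set first, then carve a validation set out of the training portion.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.25, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 16 training, 6 validation, 6 test images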
Algorithm: A novel approach to music recommendation based on the user's facial expression.
Input: Image
Output: List of the songs based on the emotion present in the image.
Begin
1. Upload image as input
2. Data pre-processing
3. Perform face detection and extract the facial region from the image
   a. detection_model_path = 'haarcascade_frontalface_default.xml'
   b. face_detection = cv2.CascadeClassifier(detection_model_path)
   c. emotion_model_path = '_mini_XCEPTION.106-0.65.hdf5'
   d. emotion_classifier = load_model(emotion_model_path, compile=False)
4. Facial emotion recognition and mapping the expression to an emotion
   a. Region of Interest (ROI) extraction
   b. Feature extraction
   c. Mapping to emotions
   d. EMOTIONS = ["angry", "disgust", "scared", "happy", "sad", "surprised", "neutral"]
5. Detect the emotion from the list of emotions
6. Retrieve the list of songs or tracks associated with different genres
7. Rank the songs based on their relevance to the detected emotion
8. Filter the song list based on relevance
9. Print the recommended songs to the user as a list
End
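A minimal, runnable sketch of the above algorithm is given below. It assumes the Haar cascade file and the pre-trained _mini_XCEPTION classifier named in step 3 are available locally, that the classifier expects 64x64 grayscale inputs (adjust to the actual input shape of the weights used), and that a hypothetical PLAYLISTS mapping from emotion to songs stands in for steps 6 to 8.

import cv2
import numpy as np
from keras.models import load_model

detection_model_path = 'haarcascade_frontalface_default.xml'
emotion_model_path = '_mini_XCEPTION.106-0.65.hdf5'
EMOTIONS = ["angry", "disgust", "scared", "happy", "sad", "surprised", "neutral"]

# Hypothetical emotion-to-song mapping; in practice this would come from a tagged music library.
PLAYLISTS = {"happy": ["Track A", "Track B"], "sad": ["Track C"], "neutral": ["Track D"]}

face_detection = cv2.CascadeClassifier(detection_model_path)
emotion_classifier = load_model(emotion_model_path, compile=False)

def recommend_songs(image_path):
    # Step 2: pre-processing - load the image and convert it to grayscale.
    frame = cv2.imread(image_path)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Step 3: face detection and extraction of the facial region (ROI).
    faces = face_detection.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return []
    x, y, w, h = faces[0]
    roi = gray[y:y + h, x:x + w]

    # Step 4: resize and normalise the ROI to the classifier's input shape
    # (64x64 is an assumption; use the shape the model was actually trained with).
    roi = cv2.resize(roi, (64, 64)).astype("float32") / 255.0
    roi = np.expand_dims(np.expand_dims(roi, axis=-1), axis=0)

    # Step 5: detect the dominant emotion.
    preds = emotion_classifier.predict(roi)[0]
    emotion = EMOTIONS[int(np.argmax(preds))]

    # Steps 6-9: retrieve and return the songs associated with the detected emotion.
    return PLAYLISTS.get(emotion, [])

print(recommend_songs("test_face.jpg"))

In a deployed system, the PLAYLISTS dictionary would be replaced by a query against a tagged music catalogue or a streaming-service API, with the returned tracks ranked and filtered by their relevance to the detected emotion.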
It is common practice to select only a subset of the relevant features (variables or predictors) for use in a model in order to improve its overall performance and ease of interpretation. The selected features should be largely independent of one another, relatively free of noise and redundancy, and strongly related to the target variable of the analysis.
Feature selection, which decreases the number of features incorporated into the model, can improve the model's performance, reduce overfitting, and aid interpretability. Filter, wrapper, and embedded methods are among the available feature selection procedures. These techniques achieve varying levels of success across different data and model types, and their computational complexity also varies.
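The sketch below illustrates one representative from each family using scikit-learn. The placeholder feature matrix, label vector, and parameter values are assumptions for illustration, not the settings used in this work.

import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif, RFE, SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

# Placeholder feature matrix (e.g. facial landmark measurements) and emotion labels.
X = np.random.rand(100, 50)
y = np.random.randint(0, 7, size=100)

# Filter method: score each feature independently with an ANOVA F-test.
X_filter = SelectKBest(score_func=f_classif, k=10).fit_transform(X, y)

# Wrapper method: recursively eliminate features using a classifier's coefficients.
X_wrapper = RFE(LogisticRegression(max_iter=1000), n_features_to_select=10).fit_transform(X, y)

# Embedded method: keep the features a tree ensemble finds important during training.
X_embedded = SelectFromModel(RandomForestClassifier(n_estimators=100)).fit_transform(X, y)

print(X_filter.shape, X_wrapper.shape, X_embedded.shape)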
This is a crucial step in the process of developing a model for machine learning. When applying a model to data that isn't initially clean or in the format required by the model, it's possible that the model will produce incorrect results. During the pre-processing phase, we modify the data so that it meets our requirements. It is utilised for the purpose of dealing with the ambiguity, redundancy, and missing values that are contained within the dataset. The term "data pre-processing" refers to a broad category of processes that include the likes of data import, data splitting, attribute scaling, and many others. It is vital to do pre-processing on the data in order to improve the precision of the model.
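For example, a minimal pre-processing routine for face images, assuming 48x48 grayscale inputs (a common choice for facial-expression datasets), might look like the following sketch.

import cv2
import numpy as np

def preprocess_image(path, size=(48, 48)):
    """Import a raw face image, scale its values, and shape it for the model."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)  # data import
    img = cv2.resize(img, size)                   # bring every sample to one shape
    img = img.astype("float32") / 255.0           # attribute scaling to [0, 1]
    return np.expand_dims(img, axis=-1)           # add the channel dimension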
To distinguish emotions such as happy, sad, angry, or neutral, a convolutional neural network (CNN) is trained as a deep learning model that categorises images, or sequences of data portraying human faces or voices, into the different emotion classes.
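As a concrete illustration, a compact CNN of this kind for 48x48 grayscale face images could look as follows; the layer sizes are illustrative assumptions rather than the exact architecture trained in this work.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(48, 48, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation="relu"),
    Dropout(0.5),
    Dense(7, activation="softmax"),  # one output per emotion class
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()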
IV. RESULTS
Given the size of the dataset, there is no incentive to construct a more intricate model by adding more layers. However, the results could be improved by lowering the minimal delta in the model's early-stopping criterion. Incorporating data from speakers of various languages and cultures could allow for the development of a model that is more broadly applicable. The model would benefit further from examples of stuttering speech and varying speech intensities, both of which are common in dementia patients. It is challenging to make direct comparisons between our model and other algorithms, since the audio recordings that make up the dataset contain some artificial noise.
In this work, a model for providing music recommendations was developed with the help of facial expressions of various emotions. We propose that the development of an emotion-aware music recommendation system can benefit from the application of facial recognition technology. Music can alleviate many kinds of mental or emotional pressure. The presented system therefore offers a face-based emotion detection system that determines an individual's emotional state and responds with appropriate musical accompaniment.
[1] S. Jhajharia, S. Pal, and S. Verma, "Wearable computing and its application," Int. J. Comp. Sci. and Inf. Tech., vol. 5, no. 4, pp. 5700–5704, 2014.
[2] K. Popat and P. Sharma, "Wearable computer applications: A feature perspective," Int. J. Eng. and Innov. Tech., vol. 3, no. 1, 2013.
[3] P. Melville and V. Sindhwani, "Recommender systems," in Encyc. of Mach. Learn. Springer, 2011, pp. 829–838.
[4] N. Sebe, I. Cohen, T. S. Huang et al., "Multimodal emotion recognition," Handbook of Pattern Recognition and Computer Vision, vol. 4, pp. 387–419, 2005.
[5] R. W. Picard, E. Vyzas, and J. Healey, "Toward machine emotional intelligence: Analysis of affective physiological state," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 10, pp. 1175–1191, 2001.
[6] D. Ayata, Y. Yaslan, and M. Kamasak, "Emotion recognition via galvanic skin response: Comparison of machine learning algorithms and feature extraction methods," IU J. of Elect. & Elect. Eng., vol. 17, no. 1, pp. 3129–3136, 2017.
[7] P. Ekman, R. W. Levenson, and W. V. Friesen, "Autonomic nervous system activity distinguishes among emotions," Am. Assoc. for Adv. of Sci., 1983.
[8] I.-h. Shin, J. Cha, G. W. Cheon, C. Lee, S. Y. Lee, H.-J. Yoon, and H. C. Kim, "Automatic stress-relieving music recommendation system based on photoplethysmography-derived heart rate variability analysis," in IEEE Int. Conf. on Eng. in Med. and Bio. Soc., IEEE, 2014, pp. 6402–6405.
[9] S. Nirjon, R. F. Dickerson, Q. Li, P. Asare, J. A. Stankovic, D. Hong, B. Zhang, X. Jiang, G. Shen, and F. Zhao, "MusicalHeart: A hearty way of listening to music," in Proc. of ACM Conf. on Emb. Netw. Sens. Sys., ACM, 2012, pp. 43–56.
[10] H. Liu, J. Hu, and M. Rauterberg, "Music playlist recommendation based on user heartbeat and music preference," in Int. Conf. on Comp. Tech. and Dev., vol. 1, IEEE, 2009, pp. 545–549.
Copyright © 2023 V Saravana Kumar, K Kranthi Kumar, G Sathish, R Sumana, A Srinidhi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Paper Id : IJRASET52832
Publish Date : 2023-05-23
ISSN : 2321-9653
Publisher Name : IJRASET